Mining Concept-Drifting Data Stream to Detect Peer to Peer Botnet Traffic
Abstract
We propose a novel stream data classification technique to detect peer-to-peer (P2P) botnets. Botnet traffic can be viewed as stream data with two important properties: infinite length and concept drift. Stream data classification is therefore better suited to botnet detection than simple classification techniques. However, no other botnet detection approach so far has applied stream data classification. We propose a multi-chunk, multi-level ensemble classifier to classify concept-drifting stream data. Previous ensemble techniques for concept-drifting stream data use a single data chunk to train each classifier. In our approach, we train an ensemble of v classifiers from r consecutive data chunks, and K of these v-classifier ensembles are used to build a second level of ensemble. By introducing this multi-chunk, multi-level ensemble, we significantly reduce error compared with the single-chunk, single-level ensemble. We justify the algorithm theoretically. We have also tested our technique on both botnet traffic and simulated data, and obtained better detection accuracy than other published work.
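The two-level structure described in the abstract can be illustrated with a minimal Python sketch. The abstract does not say how the v classifiers are derived from the r chunks, how the K ensembles are selected, or how votes are combined, so the choices below are assumptions made only for illustration: the r chunks are pooled and split into v partitions, each classifier is trained on all partitions but one, the K ensembles with the lowest error on the newest labeled chunk are kept, and predictions use majority voting. The class names (`CentroidClassifier`, `ChunkEnsemble`, `MultiLevelEnsemble`) and the nearest-centroid base learner are hypothetical and are not taken from the paper.

```python
import random
from collections import Counter


class CentroidClassifier:
    """Toy base learner (an assumption): predicts the nearest class centroid."""

    def fit(self, X, y):
        sums, counts = {}, Counter(y)
        for xi, yi in zip(X, y):
            acc = sums.setdefault(yi, [0.0] * len(xi))
            for j, value in enumerate(xi):
                acc[j] += value
        self.centroids = {c: [s / counts[c] for s in acc] for c, acc in sums.items()}
        return self

    def predict_one(self, x):
        def sq_dist(c):
            return sum((a - b) ** 2 for a, b in zip(x, self.centroids[c]))
        return min(self.centroids, key=sq_dist)


def majority(votes):
    """Return the most common vote (ties go to the first label seen)."""
    return Counter(votes).most_common(1)[0][0]


class ChunkEnsemble:
    """Lower level: v classifiers trained from r consecutive chunks (requires v >= 2)."""

    def __init__(self, chunks, v, seed=0):
        data = [pair for chunk in chunks for pair in chunk]
        random.Random(seed).shuffle(data)
        folds = [data[i::v] for i in range(v)]  # assumed v-way partitioning
        self.members = []
        for i in range(v):
            # Train on every partition except partition i.
            train = [p for j, fold in enumerate(folds) if j != i for p in fold]
            X, y = zip(*train)
            self.members.append(CentroidClassifier().fit(X, y))

    def predict_one(self, x):
        return majority(m.predict_one(x) for m in self.members)

    def error(self, chunk):
        return sum(self.predict_one(x) != y for x, y in chunk) / len(chunk)


class MultiLevelEnsemble:
    """Upper level: keeps at most K ChunkEnsembles and votes across them."""

    def __init__(self, r, v, K):
        self.r, self.v, self.K = r, v, K
        self.recent = []     # the last r labeled chunks
        self.ensembles = []  # at most K lower-level ensembles

    def update(self, chunk):
        """Consume one labeled chunk: a list of (features, label) pairs."""
        self.recent = (self.recent + [chunk])[-self.r:]
        if len(self.recent) < self.r:
            return  # not enough chunks yet to train a lower-level ensemble
        candidates = self.ensembles + [ChunkEnsemble(self.recent, self.v)]
        # Assumed selection rule: keep the K lowest-error ensembles on the
        # newest chunk. The newest ensemble has seen this chunk, so its error
        # estimate is optimistic; a real system would need to account for that.
        candidates.sort(key=lambda e: e.error(chunk))
        self.ensembles = candidates[: self.K]

    def predict_one(self, x):
        # Raises IndexError if called before r chunks have been consumed.
        return majority(e.predict_one(x) for e in self.ensembles)
```

In use, each labeled chunk of flow features would be passed to `update` as it arrives, and `predict_one` would label incoming instances in between. Discarding higher-error ensembles as new chunks arrive is what lets a structure like this follow concept drift while holding only r chunks in memory.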
Similar resources
Data mining for security applications: Mining concept-drifting data streams to detect peer to peer botnet traffic
There has been much interest in using data mining for counter-terrorism and cyber security applications. For example, data mining can be used to detect unusual patterns, terrorist activities and fraudulent behavior. In addition, data mining can also be used for intrusion detection and malicious code detection. Our current research is focusing extensively on data mining for security applications ...
A Multi-partition Multi-chunk Ensemble Technique to Classify Concept-Drifting Data Streams
We propose a multi-partition, multi-chunk ensemble classifier based data mining technique to classify concept-drifting data streams. Existing ensemble techniques in classifying concept-drifting data streams follow a single-partition, single-chunk approach, in which a single data chunk is used to train one classifier. In our approach, we train a collection of v classifiers from r consecutive dat...
BotTrack: Tracking Botnets Using NetFlow and PageRank
With large-scale botnets emerging as one of the major current threats, the automatic detection of botnet traffic is of high importance for service providers and large campus network monitoring. Faced with high-speed network connections, botnet detection must be efficient and accurate. This paper proposes a novel approach for this task, where NetFlow-related data is correlated and a host depend...
Learning to Rank from Concept-Drifting Network Data Streams
Networked data are now collected in various application domains such as social networks, biological networks, sensor networks, spatial networks, and peer-to-peer networks. Recently, the application of data stream mining to networked data, in order to study their evolution over time, has been receiving increasing attention in the research community. Following this line of research, we pr...
A Framework For Concept Drifting P2P Traffic Identification
Identification of network traffic using port-based or payload-based analysis is becoming increasingly difficult, with many Peer-to-Peer (P2P) applications using dynamic ports, masquerading techniques, and encryption to avoid detection. To overcome this problem, several machine learning techniques were proposed to classify P2P traffic. But in the real P2P network environment, new communities of peer...